Abstract: Wyner’s soft-covering lemma is a valuable tool
for achievability proofs of information theoretic security,
resolvability, channel synthesis, and source coding. The result herein
sharpens the claim of soft-covering by moving away from an
expected value analysis. Instead, a random codebook is shown to
achieve the soft-covering phenomenon with high probability. The
probability of failure is doubly-exponentially small in the
block-length, enabling more powerful applications through the union
bound.

I. Claim

Given a channel $Q_{V|U}$ and an input distribution $Q_U$, let
the output distribution be $Q_V$. Also, let the $n$-fold memoryless
extensions of these be denoted $Q_{V^n|U^n}$, $Q_{U^n}$, and $Q_{V^n}$.

Wyner’s soft-covering lemma [1, Theorem 6.3] says that
the distribution induced by selecting a $U^n$ sequence at random
from an appropriately chosen set and passing this sequence
through the memoryless channel $Q_{V^n|U^n}$ will be a good
approximation of $Q_{V^n}$ in the limit of large $n$ as long as the
set is of size greater than $2^{nR}$ where $R > I(U;V)$. In fact,
the set can be chosen quite carelessly, by random codebook
construction, drawing each sequence independently from the
distribution $Q_{U^n}$.

The soft-covering lemmas in the literature use a distance
metric on distributions (commonly total variation or relative
entropy) and claim that the distance between the induced
distribution $P_{V^n}$ and the desired distribution $Q_{V^n}$ vanishes
in expectation over the random selection of the set.^1 In the
literature, [2] studies the fundamental limits of soft-covering as
“resolvability,” [3] provides rates of exponential convergence,
[4] improves the exponents and extends the framework, [5] and
[6, Chapter 16] refer to soft-covering simply as “covering” in
the quantum context, [7] refers to it as a “sampling lemma”
and points out that it holds for the stronger metric of relative
entropy, and [8] gives a recent direct proof of the relative
entropy result.

Here we give a stronger claim. With high probability 
with respect to the set construction, the distance will vanish 
exponentially quickly with the block-length n. The negligible 
probability of the random set not producing this desired result 
is doubly-exponentially small. 

Let us define precisely the induced distribution. Let
$\mathcal{C} = \{u^n(m)\}_{m=1}^{M}$ be the set of sequences, which will be referred
to as the codebook. The size of the codebook is $M = 2^{nR}$.
Then the induced distribution is:

$$P_{V^n|\mathcal{C}} = 2^{-nR} \sum_{u^n(m) \in \mathcal{C}} Q_{V^n|U^n = u^n(m)}. \tag{1}$$


^1 Many of the theorems only claim existence of a good codebook, but all
of the proofs use expected value to establish existence.


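Lemma 1. For any $Q_U$, $Q_{V|U}$, and $R > I(U;V)$, where $V$
has a finite support $\mathcal{V}$, there exist $\gamma_1 > 0$ and $\gamma_2 > 0$
such that for $n$ large enough

$$\mathbb{P}\left( d(P_{V^n|\mathcal{C}}, Q_{V^n}) > 2^{-\gamma_1 n} \right) \le e^{-2^{\gamma_2 n}}, \tag{2}$$

where $d(\cdot,\cdot)$ is the relative entropy.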
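As a sanity check on the induced distribution (1) and on the quantity bounded in Lemma 1, the following minimal sketch computes the relative entropy exactly for a toy case. Everything specific here is an assumption made for illustration and is not taken from the paper: the binary symmetric channel with crossover probability `p`, the uniform binary input (so that $Q_{V^n}$ is uniform), the small block-length, and all function names.

```python
import math
import random


def induced_distribution(codebook, n, p):
    """Return P_{V^n|C} as a list indexed by the integer encoding of v^n.

    Assumes a BSC(p): the likelihood of v^n given u^n depends only on the
    Hamming distance between them.
    """
    # Probability of one specific flip pattern with d flipped bits out of n.
    flip_prob = [(p ** d) * ((1 - p) ** (n - d)) for d in range(n + 1)]
    dist = []
    for v in range(2 ** n):
        # Uniform mixture over the codewords, as in (1).
        total = sum(flip_prob[bin(v ^ u).count("1")] for u in codebook)
        dist.append(total / len(codebook))
    return dist


def relative_entropy_bits(p_dist, q_dist):
    """Relative entropy d(P, Q) in bits for two probability mass functions."""
    return sum(pv * math.log2(pv / qv)
               for pv, qv in zip(p_dist, q_dist) if pv > 0)


def soft_covering_divergence(n, rate, p, rng):
    """Draw a random codebook of size 2^{n*rate} and return d(P_{V^n|C}, Q_{V^n})."""
    m = round(2 ** (n * rate))
    # Each codeword is i.i.d. uniform on {0,1}^n, encoded as an n-bit integer.
    codebook = [rng.getrandbits(n) for _ in range(m)]
    q_vn = [2.0 ** (-n)] * (2 ** n)  # uniform output law for uniform input
    return relative_entropy_bits(induced_distribution(codebook, n, p), q_vn)
```

With `n = 10` and `p = 0.2`, for which $I(U;V) = 1 - h(0.2) \approx 0.278$ bits, a single codeword (`rate = 0`) gives exactly $n(1 - h(p))$ bits of divergence, while a codebook drawn at `rate = 0.9` should give a much smaller value. Repeating the draw with different seeds gives a rough feel for how tightly the divergence concentrates across codebooks.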

Proof: We state the proof in terms of arbitrary
distributions (not necessarily discrete). When needed, we will
specialize to the case that $\mathcal{V}$ is finite.

Let the Radon-Nikodym derivative between the induced
and desired distributions be denoted as

$$D_{\mathcal{C}}(v^n) \triangleq \frac{dP_{V^n|\mathcal{C}}}{dQ_{V^n}}(v^n). \tag{3}$$

In the discrete case, this is just a ratio of probability mass 
functions. 

Notice that the relative entropy of interest, which is a
function of the codebook $\mathcal{C}$, is given by

$$d(P_{V^n|\mathcal{C}}, Q_{V^n}) = \int dP_{V^n|\mathcal{C}} \log D_{\mathcal{C}}. \tag{4}$$

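Define the jointly-typical set over $u$ and $v$ sequences by

$$\mathcal{A}_\epsilon \triangleq \left\{ (v^n, u^n) : \frac{1}{n} \log_2 \frac{dQ_{V^n|U^n=u^n}}{dQ_{V^n}}(v^n) \le I_Q(U;V) + \epsilon \right\}. \tag{5}$$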

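We split $P_{V^n|\mathcal{C}}$ into two parts, making use of the indicator
function denoted by $\mathbf{1}$. Let $\epsilon > 0$ be arbitrary, to be determined
later.

$$P_{\mathcal{C},1} \triangleq 2^{-nR} \sum_{u^n(m) \in \mathcal{C}} Q_{V^n|U^n=u^n(m)} \mathbf{1}_{(V^n, u^n(m)) \in \mathcal{A}_\epsilon}, \tag{6}$$

$$P_{\mathcal{C},2} \triangleq 2^{-nR} \sum_{u^n(m) \in \mathcal{C}} Q_{V^n|U^n=u^n(m)} \mathbf{1}_{(V^n, u^n(m)) \notin \mathcal{A}_\epsilon}. \tag{7}$$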

The measures $P_{\mathcal{C},1}$ and $P_{\mathcal{C},2}$ on the space $\mathcal{V}^n$ are not
probability measures, but $P_{\mathcal{C},1} + P_{\mathcal{C},2} = P_{V^n|\mathcal{C}}$ for each codebook
$\mathcal{C}$.

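Let us also split $D_{\mathcal{C}}$ into two parts:

$$D_{\mathcal{C},1}(v^n) \triangleq \frac{dP_{\mathcal{C},1}}{dQ_{V^n}}(v^n), \tag{8}$$

$$D_{\mathcal{C},2}(v^n) \triangleq \frac{dP_{\mathcal{C},2}}{dQ_{V^n}}(v^n). \tag{9}$$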







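By Jensen’s inequality (or the data processing inequality)
we can upper bound the relative entropy of interest:

$$d(P_{V^n|\mathcal{C}}, Q_{V^n}) \le h\left(\int dP_{\mathcal{C},1}\right) + \int dP_{\mathcal{C},1} \log D_{\mathcal{C},1} + \int dP_{\mathcal{C},2} \log D_{\mathcal{C},2}, \tag{10}$$

where $h(\cdot)$ is the binary entropy function.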
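One way to see this bound, sketched here under the assumption that both parts carry positive mass, is through convexity of relative entropy in its first argument. The weights $a_i$ and the normalized measures $\tilde{P}_i$ are notation introduced only for this sketch.

```latex
\begin{align*}
a_i &\triangleq \int dP_{\mathcal{C},i}, \qquad
\tilde{P}_i \triangleq \frac{P_{\mathcal{C},i}}{a_i}, \qquad
P_{V^n|\mathcal{C}} = a_1 \tilde{P}_1 + a_2 \tilde{P}_2,
\qquad a_1 + a_2 = 1, \\
d(P_{V^n|\mathcal{C}}, Q_{V^n})
 &\le \sum_{i=1}^{2} a_i \, d(\tilde{P}_i, Q_{V^n})
 && \text{(convexity, i.e. Jensen)} \\
 &= \sum_{i=1}^{2} \int dP_{\mathcal{C},i} \log \frac{D_{\mathcal{C},i}}{a_i}
 && \text{since } \frac{d\tilde{P}_i}{dQ_{V^n}} = \frac{D_{\mathcal{C},i}}{a_i} \\
 &= \sum_{i=1}^{2} a_i \log \frac{1}{a_i}
    + \int dP_{\mathcal{C},1} \log D_{\mathcal{C},1}
    + \int dP_{\mathcal{C},2} \log D_{\mathcal{C},2} \\
 &= h(a_1)
    + \int dP_{\mathcal{C},1} \log D_{\mathcal{C},1}
    + \int dP_{\mathcal{C},2} \log D_{\mathcal{C},2}.
\end{align*}
```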


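Notice that $P_{\mathcal{C},1}$ will usually contain almost all of the
probability. That is, denoting the complement of $\mathcal{A}_\epsilon$ as $\bar{\mathcal{A}}_\epsilon$,

$$\int dP_{\mathcal{C},2} = 1 - \int dP_{\mathcal{C},1} \tag{11}$$

$$= 2^{-nR} \sum_{u^n(m,\mathcal{C}) \in \mathcal{C}} \mathbb{P}_Q\left(\bar{\mathcal{A}}_\epsilon \,\middle|\, U^n = u^n(m,\mathcal{C})\right). \tag{12}$$

This is an average of exponentially many i.i.d. random
variables bounded between 0 and 1. Furthermore, the expected
value of each one is the exponentially small probability of
correlated sequences being atypical:

$$\mathbb{E}\, \mathbb{P}_Q\left(\bar{\mathcal{A}}_\epsilon \,\middle|\, U^n = u^n(m,\mathcal{C})\right) = \mathbb{P}_Q\left(\bar{\mathcal{A}}_\epsilon\right) \tag{13}$$

$$\le 2^{-\beta n}, \tag{14}$$

where

$$\beta = \max_{\alpha \ge 1} (\alpha - 1)\left(I_Q(U;V) + \epsilon - d_\alpha(Q_{U,V}, Q_U Q_V)\right), \tag{15}$$

and $d_\alpha(\cdot,\cdot)$ is the Rényi divergence of order $\alpha$. We use
units of bits for mutual information and Rényi divergence to
coincide with the base two expression of rate.

Therefore, the Chernoff bound assures that $\int dP_{\mathcal{C},2}$ is
exponentially small. That is, for any $\beta_1 < \beta$,

$$\mathbb{P}\left(\int dP_{\mathcal{C},2} > 2 \cdot 2^{-\beta_1 n}\right) \le e^{-\frac{1}{3} 2^{n(R - \beta_1)}}. \tag{16}$$

Similarly, $D_{\mathcal{C},1}(v^n)$ is an average of exponentially many i.i.d.
and uniformly bounded functions, each one determined by one
sequence in the codebook:

$$D_{\mathcal{C},1}(v^n) = 2^{-nR} \sum_{u^n(m) \in \mathcal{C}} \frac{dQ_{V^n|U^n=u^n(m)}}{dQ_{V^n}}(v^n)\, \mathbf{1}_{(v^n, u^n(m)) \in \mathcal{A}_\epsilon}. \tag{17}$$

For every term in the average, the indicator function bounds
the value to be between 0 and $2^{n(I_Q(U;V) + \epsilon)}$. The expected
value of each term with respect to the codebook is bounded
above by one, which is observed by removing the indicator
function. Therefore, the Chernoff bound assures that $D_{\mathcal{C},1}(v^n)$
exceeds one by at most an exponentially small amount, for every $v^n$.
For any $\beta_2 > 0$,

$$\mathbb{P}\left(D_{\mathcal{C},1}(v^n) > 1 + 2^{-\beta_2 n}\right) \le e^{-\frac{1}{3} 2^{n(R - I_Q(U;V) - \epsilon - 2\beta_2)}} \quad \forall v^n. \tag{18}$$

At this point we will use the fact that $\mathcal{V}$ is a finite set to
obtain two bounds. First,

$$D_{\mathcal{C},2}(v^n) \le \left(\max_{v \in \mathcal{V}} \frac{1}{Q_V(v)}\right)^n \quad \forall v^n \in \mathcal{V}^n \text{ w.p. 1}. \tag{19}$$

Notice that the maximum is only over the support of $V$, which
makes this bound finite. This restriction is possible because,
with probability one, a conditional distribution is
absolutely continuous with respect to its associated marginal
distribution.

Next we use the union bound applied to (16) and (18),
taking advantage of the fact that the space $\mathcal{V}^n$ is only
exponentially large. Let $\mathcal{S}$ be the set of codebooks such that all of
the following are true:

$$\int dP_{\mathcal{C},2} \le 2 \cdot 2^{-\beta_1 n}, \tag{20}$$

$$D_{\mathcal{C},1}(v^n) \le 1 + 2^{-\beta_2 n} \quad \forall v^n \in \mathcal{V}^n, \tag{21}$$

$$D_{\mathcal{C},2}(v^n) \le \left(\max_{v \in \mathcal{V}} \frac{1}{Q_V(v)}\right)^n \quad \forall v^n \in \mathcal{V}^n. \tag{22}$$

We see that the probability of not being in $\mathcal{S}$ is doubly
exponentially small:

$$\mathbb{P}(\mathcal{C} \notin \mathcal{S}) \le e^{-\frac{1}{3} 2^{n(R - \beta_1)}} + |\mathcal{V}|^n e^{-\frac{1}{3} 2^{n(R - I_Q(U;V) - \epsilon - 2\beta_2)}}. \tag{23}$$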


This use of the Chernoff bound has been used before for a 
soft-covering lemma in the proof of Lemma 9 of in. 
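For reference, both Chernoff steps are consistent with the following standard multiplicative form. That this is the exact variant intended is an assumption; the symbols $X_m$, $\mu$, and $\delta$ are introduced only for this sketch.

```latex
% X_1, ..., X_M i.i.d. with 0 <= X_m <= 1 and E[X_m] <= mu; 0 < delta <= 1.
\[
\mathbb{P}\left( \frac{1}{M} \sum_{m=1}^{M} X_m \ge (1+\delta)\,\mu \right)
\le e^{-\frac{1}{3} \delta^2 M \mu}.
\]
% Bound on \int dP_{C,2}: X_m is the conditional probability of the atypical
% set given the m-th codeword, mu = 2^{-\beta_1 n}, delta = 1, M = 2^{nR}.
\[
\delta^2 M \mu = 2^{n(R - \beta_1)}.
\]
% Bound on D_{C,1}(v^n): X_m is the m-th term of the average scaled by
% 2^{-n(I_Q(U;V)+\epsilon)}, so mu = 2^{-n(I_Q(U;V)+\epsilon)},
% delta = 2^{-\beta_2 n}, M = 2^{nR}.
\[
\delta^2 M \mu = 2^{n(R - I_Q(U;V) - \epsilon - 2\beta_2)}.
\]
```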


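What remains is to show that for every codebook in $\mathcal{S}$, the
relative entropy is exponentially small. We begin from (10).
Since

$$h(x) \le x \log\frac{e}{x}, \tag{24}$$

we have

$$h\left(\int dP_{\mathcal{C},1}\right) = h\left(\int dP_{\mathcal{C},2}\right) \tag{25}$$

$$\le 2 \cdot 2^{-\beta_1 n}\left(\beta_1 n \log 2 + \log e - \log 2\right). \tag{26}$$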


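Furthermore,

$$\int dP_{\mathcal{C},1} \log D_{\mathcal{C},1} \le \int dP_{\mathcal{C},1} \log\left(1 + 2^{-\beta_2 n}\right) \tag{27}$$

$$\le \log\left(1 + 2^{-\beta_2 n}\right) \tag{28}$$

$$\le 2^{-\beta_2 n} \log e. \tag{29}$$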


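Finally,

$$\int dP_{\mathcal{C},2} \log D_{\mathcal{C},2} \le \int dP_{\mathcal{C},2} \log\left(\max_{v \in \mathcal{V}} \frac{1}{Q_V(v)}\right)^n \tag{30}$$

$$\le n \log\left(\max_{v \in \mathcal{V}} \frac{1}{Q_V(v)}\right) \int dP_{\mathcal{C},2} \tag{31}$$

$$\le n \log\left(\max_{v \in \mathcal{V}} \frac{1}{Q_V(v)}\right) 2 \cdot 2^{-\beta_1 n}. \tag{32}$$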

Note: Relative entropy can be used to bound total variation 
via Pinsker’s inequality. With that approach you lose a factor 
of two in the exponent of decay. On the other hand, the last 
steps of the proof can be modified to produce a total variation 
bound instead of relative entropy. This direct method keeps 
the error exponents the same for the total variation case as it 
is for relative entropy. 
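The factor of two can be read off from Pinsker’s inequality. The following sketch assumes that total variation is defined as $\sup_A |P(A) - Q(A)|$ and that relative entropy is measured in bits; the norm notation is introduced here only for illustration.

```latex
\[
\left\| P_{V^n|\mathcal{C}} - Q_{V^n} \right\|_{TV}
\le \sqrt{\frac{\ln 2}{2}\, d(P_{V^n|\mathcal{C}}, Q_{V^n})}
\le \sqrt{\frac{\ln 2}{2}}\; 2^{-\frac{\gamma_1}{2} n}
\quad \text{whenever } d(P_{V^n|\mathcal{C}}, Q_{V^n}) \le 2^{-\gamma_1 n}.
\]
```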








II. Applications 

This stronger version of Wyner’s soft-covering lemma has 
important applications, particularly to information theoretic 
security. The main advantage of this lemma comes from the 
union bound. 

The usual random coding argument for information theory 
uses a randomly generated codebook until the final steps of the 
achievability proof. In this final step, it is claimed that there 
exists a good codebook based on the analysis. This can be done 
by analyzing the expected value of the performance for the 
random ensemble and claiming that at least one codebook is
as good as the expected value. Alternatively, one can make the 
argument based on the probability that the randomly generated 
codebook has a good performance. If that probability is greater 
than zero, then there is at least one good codebook. The 
second approach can be advantageous when performance is 
not captured by one scalar value that is easily analyzed— 
for example, if “good” performance involves a collection of 
constraints. 

This stronger soft-covering lemma gives a very strong 
assurance that soft-covering will hold. Even if the codebook 
needs to satisfy exponentially many constraints related to soft- 
covering, the union bound will yield the claim that a codebook 
exists which satisfies them all simultaneously. Indeed, if you 
ran the soft-covering experiment exponentially many times, 
regardless of how the codebooks are correlated from one 
experiment to the next, the probability of seeing even one fail 
is still doubly-exponentially small. 
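The arithmetic behind this claim can be sketched numerically. The exponents used below (a constraint-count exponent `a` and a failure exponent `gamma2`) are hypothetical values chosen only for illustration, and the function names are not from the paper. The computation stays in the log domain to avoid floating-point underflow.

```python
import math


def log2_union_bound(n, a, gamma2):
    """log2 of 2^{a n} * exp(-2^{gamma2 n}).

    This is the union bound over 2^{a n} events, each of which fails with
    probability at most exp(-2^{gamma2 n}).
    """
    exponent = gamma2 * n
    if exponent > 1000:  # 2.0 ** exponent would overflow a float
        return -math.inf
    return a * n - (2.0 ** exponent) * math.log2(math.e)


def smallest_safe_blocklength(a, gamma2, target_log2=-20.0, n_max=10_000):
    """Smallest n whose union bound is at most 2^{target_log2}, or None."""
    for n in range(1, n_max + 1):
        if log2_union_bound(n, a, gamma2) <= target_log2:
            return n
    return None
```

For small `n` the exponential count of constraints dominates and the bound is vacuous, but once $2^{\gamma_2 n}$ overtakes the linear term `a * n` the bound collapses very quickly. This is why the doubly-exponential failure probability tolerates exponentially many constraints.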

A. Semantic Security 

Wyner’s soft-covering lemma has become a standard tool
for proving that strong perfect secrecy is achieved in the wiretap
channel (see e.g. [9]). Coincidentally, Wyner introduced
both the idea of soft covering [1] and the wiretap channel [10]
in the same year, but he didn’t connect the two together.

According to the usual definition, strong perfect secrecy is 
achieved if the mutual information (unnormalized) between the 
message and the eavesdropper’s channel output can be made 
arbitrarily small. 

An even stronger notion of near-perfect secrecy is semantic
security. This requires that any two messages cannot
be distinguished, usually measured by total variation. This 
is not implied by the above strong secrecy because mutual 
information is an average quantity. Since there are so many 
messages, the mutual information can be small even if a few 
of the messages are perfectly distinguishable. 
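A toy calculation illustrates the gap. The construction below is hypothetical and not taken from the paper: among `num_messages` equally likely messages, message 0 always produces eavesdropper output 1 and every other message always produces output 0.

```python
import math


def binary_entropy_bits(x):
    """Binary entropy h(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def leakage_and_distinguishability(num_messages):
    """Return (I(M;Z) in bits, total variation between messages 0 and 1).

    Assumes num_messages >= 2 and a uniformly distributed message M.
    """
    # Z is a deterministic function of M, so I(M;Z) = H(Z) = h(1/num_messages).
    mutual_information = binary_entropy_bits(1.0 / num_messages)
    # Output laws over z in {0, 1}: point masses on different symbols.
    p_z_given_0 = [0.0, 1.0]
    p_z_given_1 = [1.0, 0.0]
    total_variation = 0.5 * sum(abs(a - b)
                                for a, b in zip(p_z_given_0, p_z_given_1))
    return mutual_information, total_variation
```

As the number of messages grows, the mutual information tends to zero, so strong secrecy holds in the limit. Message 0 nevertheless remains perfectly distinguishable from every other message, so semantic security fails.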

Semantic security is an operationally relevant metric and 
widely adopted in cryptography. In [11] it is shown that
semantic security is essentially equivalent to stipulating that 
the capacity of the channel from the transmitted message to 
the eavesdropper’s observations is negligible, rather than the 
mutual information with respect to a uniformly distributed 
message. They also show that for some binary channels 
semantic security can be achieved at rates up to Wyner’s 
secrecy capacity. Note that contrary to the claim in [12], it
is not sufficient to analyze the random codebook ensemble for 
an arbitrary message distribution in order to claim semantic
security. A single codebook must work well for all message
distributions. 

The soft-covering lemma is used in the proof of the wiretap
channel in the following way. A random codebook is used for
communication to the intended receiver; however, two digital
messages are concatenated and fed into the encoder (mapped
to the codewords): the actual message to be transmitted and a
random sequence of bits. This random sequence of bits is what
provides the secrecy. Since the sequence is random,
for any individual transmitted message there is a collection
of codewords from which one is selected uniformly at random
and transmitted. The soft-covering lemma says that the output
at the eavesdropper will look i.i.d. if the size of this set is large
enough. More importantly, this i.i.d. output distribution does
not depend on the message that was transmitted.

This argument, using the standard soft-covering lemma 
(expectation with respect to the codebook), is good enough 
to claim that the output distribution is close to the i.i.d. 
distribution on average over the messages. This can then be 
used to claim that the mutual information is small. However, 
for semantic security, it must be claimed that the output 
distribution is close to the i.i.d. distribution for all messages,
and there are exponentially many messages. Here is where 
the stronger soft-covering lemma provided in this work is 
advantageous. Using the stronger lemma we can claim that 
a single codebook exists that accomplishes this for every 
message. 

For the single-transmitter wiretap setting, semantic security 
can be achieved by other means. The expurgation technique 
that is used to bound the maximum error probability in channel 
coding can be used here. Any offending messages, which do 
not produce the desired output distribution at the eavesdropper, 
can be removed from the codebook, and this can be shown 
to only negligibly reduce the message rate. However, this 
expurgation technique will not work in all settings, such as
the multiple access wiretap channel. On the other hand, the 
proof method involving this stronger soft-covering lemma will 
work in that setting. Thus, strong secrecy can be upgraded to 
semantic security even in situations where vanishing average 
error probability cannot be upgraded to vanishing maximum 
error probability. 

B. Distributed Channel Synthesis 

In previous work [4], we characterized the minimum rates
of communication and common randomness needed to
synthesize a memoryless channel, where the channel inputs are
observed at the location of the transmitter, and the channel 
outputs are produced at the location of the receiver. This 
is referred to as distributed channel synthesis. We say that 
synthesis is achieved if it is not possible to distinguish the 
synthetic channel from the genuine memoryless channel that 
it mimics upon observing the channel inputs and outputs. 

The work in [4] only considers the case where the input is
a fixed i.i.d. distribution. A stronger claim would be to say that
the synthetic channel cannot be distinguished from the genuine
channel even for arbitrary inputs (perhaps with a statistical
constraint).^2 However, the proof in [4] relies heavily on the
soft-covering lemma, and the exponential size of even a single
type of input sequences made such a claim elusive. A single
codebook would need to work well for all input sequences,
but the soft-covering lemma only showed that it would work
well on average.

^2 This stronger claim was shown independently in the work of [13] using
an entirely different proof.

With this stronger soft-covering lemma, it may be possible
to use the union bound to claim that the soft-covering phenomenon
will hold for all of the channel inputs simultaneously.

C. Wiretap Channel II 

The wiretap channel has been studied in other forms aside
from the memoryless channel setting. One such variation,
where the eavesdropper gets to make choices about his own
channel noise, has been referred to as the Wiretap Channel
II [14]. The original formulation was a channel where the
eavesdropper is allowed to decide which transmission packets
to observe while being limited in quantity. If the selection
of observed packets is an i.i.d. process, then this is the
standard wiretap channel setting with an erasure channel to
the eavesdropper. The secrecy capacity of the wiretap channel
type II, where the eavesdropper selects the packets to observe,
was solved in [14] only for the case of a noise-free channel to
the legitimate receiver. Recent work [15] investigates the case
where the channel to the legitimate receiver is also noisy, for
which the secrecy capacity is yet unknown.

The challenge in this setting is that the eavesdropper knows 
the codebook when it selects the packets to observe. Therefore, 
secrecy will only be achieved if it is achieved uniformly for all 
selections of packets, of which there are exponentially many 
possibilities. 

Using the lemma provided in this work, it can be shown
that rates all the way up to the secrecy capacity of the memoryless
erasure channel can be achieved even in this more stringent
setting. The codebook construction for the wiretap channel 
is symmetric in time, so the secrecy analysis, with respect 
to the random codebook, does not depend on the specific 
choice of packets observed. The remaining step that is needed 
is to show that a single codebook exists which will provide 
secrecy simultaneously for each one of the exponentially many 
observation sequences. This is what the stronger soft-covering 
lemma provides. 